2:14PM; PRIEST LAW OFFICES; 919 806 1690



Appl. No. 09/845,561 
Amdt. dated March 5, 2004
Reply to Office Action of March 1, 2004

Amendments to the Claims: 

This listing of claims will replace all prior versions, and listings, of claims in the 
application: 
Listing of Claims: 

Please cancel claims 25, 26, 29, and 30 without prejudice. 

1. (original): A method of modeling phenomena comprising the steps of:

creating a set of tags, each tag controlling one or more aspects of one or more
phenomena;

arranging selected members of the set of tags in a desired sequence to produce
phenomena as defined by the sequence of tags; and

processing the tags in order to produce phenomena having the characteristics defined by
the tags.

2. (original): The method of claim 1 wherein the phenomena controlled by the
tags are characteristics of speech, wherein the step of arranging selected members of the tags in a
desired sequence comprises placing the selected members of the set of tags into a body of text
and wherein the step of processing the tags comprises processing the body of text and the tags to
produce speech having characteristics defined by the tags.

3. (original): The method of claim 2 wherein the characteristics of speech are
prosodic characteristics of speech.

4. (original): The method of claim 3 wherein each tag imposes a constraint on 
the prosodic characteristics of speech affected by the tag. 

5. (original): The method of claim 4 wherein each of the tags specifies an action
to be taken and includes parameters defining attributes and associated values providing 
information about the action to be taken. 

6. (original): The method of claim 5 wherein each of the tags may include a 
parameter specifying the location at which the tag takes effect. 

7. (original): The method of claim 6 wherein the set of tags includes tags which 
establish settings which remain unchanged until altered by a subsequent tag. 

8. (original): The method of claim 7 wherein the set of tags includes members 
which define the pitch behavior of speech over the course of a phrase. 

9. (original): The method of claim 8 wherein the set of tags includes tags 
defining accents which define the pitch behavior of local influences within a phrase. 

10. (original): The method of claim 6 wherein the set of tags includes tags
defining phrase boundaries which mark boundaries between regions at which tags have effect. 

11. (original): The method of claim 10 wherein a tag which defines a phrase
boundary prevents tags following the tag which marks the boundary from influencing speech
components preceding the tag which marks the boundary.

12. (original): The method of claim 9 wherein each of the tags may include values
defining type and strength in order to define interaction of the tag with other tags.


13. (original): The method of claim 12 wherein a tag may compromise its shape,
average pitch or both depending on the value defining type. 

14. (original): The method of claim 8 wherein the step of processing the tags
includes establishing a phrase curve by creating and solving equations defined by tags which 
specify changes in pitch and tags which specify rates of changes in pitch. 

15. (original): The method of claim 14 wherein the body of text and the tags are 
processed one minor phrase at a time. 

16. (original): The method of claim 15 wherein processing of a phrase includes
using values describing properties prevailing near the end of an immediately preceding phrase. 

17. (original): The method of claim 9 wherein the step of processing the tags
includes establishing a pitch curve by creating and solving equations defined by tags which 
specify accents. 



18. (original): The method of claim 17 wherein the body of text and the tags are
processed one minor phrase at a time.

19. (original): The method of claim 18 wherein processing of a phrase includes
using values describing properties prevailing near the end of an immediately preceding phrase.

20. (original): A method of processing a body of text including tags defining 
prosodic characteristics of speech to be produced by processing the text, comprising the steps of: 

extracting the tags from the text;
creating a set of equations defining a phrase curve; 
solving the set of equations to produce the phrase curve; 
creating a set of equations defining a pitch curve; 
solving the set of equations to produce the pitch curve; 

mapping linguistic concepts represented by the phrase curve and the pitch curve to
acoustical observables; and 

performing a nonlinear transformation to adjust the prosodic characteristics defined by 
tags to human perceptions and expectations. 

21. (original): A method of defining a set of tags specifying prosodic
characteristics of a target speaker, comprising the steps of: 

selecting a body of training text; 

receiving speech representing reading of the training text by the target speaker to form a
training corpus;

analyzing the training corpus to identify prosodic characteristics of the training corpus;
and

creating a set of tags defining the identified prosodic characteristics of the training corpus. 

22. (original): A method of placing tags in text for text to speech processing 
comprising the steps of: 

placing tags in a body of training text to model prosodic characteristics of a training 
corpus produced by reading of the training text; 

analyzing the placement of the tags in the training text to develop a set of rules for 
placement of tags in text; and 

applying the rules to text for which text to speech processing is desired to place tags in 
the text in order to produce speech having desired prosodic characteristics. 

23. (original): A text to speech system for receiving text inputs comprising text to
be processed to generate speech and tags defining prosodic characteristics of the speech to be 
generated, comprising: 

a text input interface for receiving the text input; 

a speech modeler operative to process the text inputs to produce speech having the 
prosodic characteristics specified by the tags; and 

a speech output interface for producing the speech output. 

24. (original): The system of claim 23 wherein the speech modeler is further 
operative to process a training corpus representing a reading of text by a target speaker to 
produce tags defining prosodic characteristics of the training corpus and use the tags to produce 
speech having prosodic characteristics typical of the target speaker. 

25. (canceled) 

26. (canceled) 

27. (original): The method of claim 2 wherein each tag imposes a constraint on 
motion of an articulator used to produce speech. 

28. (original): The method of claim 1 wherein each tag imposes a constraint on 
modeled muscular motions used to simulate gestures or facial expression. 

29. (canceled) 

30. (canceled) 

31. (original): The method of claim 9 wherein one or more tags are placed within
a proper noun comprising two or more words, each such tag producing prosody indicating to a
listener that the proper noun is to be interpreted as a single entity rather than as more than one
entity.

32. (original): The method of claim 31 wherein the tag produces an increase in 
the pitch and speed of speech over the speech affected by the tag.

33. (original): The method of claim 9 wherein one or more tags are placed to 
produce a word having prosody indicating that the word requires confirmation. 

34. (original): The method of claim 33 wherein the prosody indicating that the 
word requires confirmation is characterized by a relatively high and increasing pitch across the 
word requiring confirmation. 